PERSONAL AUDIO MESSAGE PROCESSOR AND METHOD



Field of the Invention 

The present invention relates generally to dictation
and audio communication devices and, more particularly,
concerns a method and portable apparatus for audio
communication, including the recording and editing of voice
mail and audio content and its transmission and reception over
a private or public network, such as the Internet, using common
electrical communication media or data links.

Background of the Invention 

All electronic message systems, with the exception
of voice-mail, have intermediate devices or storage media
whereby data may be transferred, preferably at a high
transmission rate, over a standard communication link and
stored in a storage medium or onto an unattended device for
later off-line access, review and editing by the intended user.

In the case of a facsimile transmission, an image is
scanned by the transmitter and then transmitted and ultimately
printed at a remote site for off-line utilization by the
intended receiver. In the case of electronic mail, data is
generated on a computer and then transmitted and stored either
directly on the intended user's unattended computer or on a
central host computer linked to a network of computers for
subsequent retrieval by the intended user. The most common
networks are Local Area Networks (LANs), Wide Area Networks
(WANs), and public networks, such as the Internet, or private
networks. When the intended user accesses his computer, either
the E-mail is already resident, or he finds a message displayed
in a graphic editor indicating that he has mail and how he can
retrieve it. Once the E-mail is retrieved, it likewise may be
read, reviewed and manipulated by the intended user off-line
on the user's computer. Alternatively, it may be output to
a printer, providing the user a hard copy for review at his
convenience.

When a facsimile machine is unavailable, a facsimile
may be transmitted to a computer or to a handheld, paperless
fax machine, such as Reflection Technology, Inc.'s FaxView
personal fax reader, for off-line and independent review by
the recipient.

Utilities exist for both facsimile and E-mail
messages, whereby messages may be selected from a host by an
authorized user for subsequent transmission to the user's
E-mail address or unattended facsimile machine. See, for
example, Duehren et al., U.S. Patent No. 4,918,722.

Recently, with the widespread and growing usage of
the Internet and, more particularly, with the growing
popularity of Web sites offering published material in the form
of HTML (Hyper Text Markup Language) documents, utilities have
been created which permit such files to be selected for
subsequent off-line access and independent review by fax. See,
for example, FactsLine for the Web, by Ibex Technologies, Inc.
Such a utility makes the large volume of information and
graphics offered over the Internet available to users who
either do not have access to a computer connected to the
Internet, or wish to limit the amount of time spent on-line.

A large percentage of potential users do not have
access to the Internet or, even if they do, may be traveling,
may not have access to their computers, or may not wish to
spend time booting their computer and waiting for Web site
graphics. Utilities such as Web-On Call Voice Browser by
Netphonic Communications, Inc. have been introduced which
permit users to access the Internet in response to voice
prompts, to navigate to a document or E-mail of interest, to
identify a document by number and to have a selected document
read in real-time over the phone using a synthesized voice
and faxed back or sent as an e-mail attachment.

Similarly the widespread use of the Internet and 
heavy traffic to particularly popular Web sites or during 
Server). Systems such as this represent a real breakthrough,
since in the past, delivery of audio by conventional on-line
methods downloaded it at such low rates that acquiring the
information took five times as long as the actual program.
This required the listener to wait 25 minutes before listening
to 5 minutes of audio.

As a result of the availability of streaming audio
over the Internet, a number of companies have introduced
Internet telephone products which permit users having
multimedia computers programmed with proprietary software to
talk in real time over the Internet (see VocalTec). Such a
system is useful over long distances when users can access a
local Internet access point or point of presence, making a
long distance call into a local call.

Similarly, as a result of streaming audio over the
Internet, content providers are able to broadcast live audio
from a Web site (e.g. AudioNet by Cameron Audio Networks).

Recently a standard-based implementation for
communication over the Internet has been introduced, and
supported by Intel and Microsoft, which makes use of the DSP
Group's TrueSpeech G.723 compression technology. This uses an
advanced algorithm that results in excellent voice quality,
despite a high compression ratio, and operates at 6.3 kilobits
per second (kbps) and 5.3 kbps with compression ratios of 20:1
and 24:1, respectively. It also includes silence compression
which can bring the effective rate down to less than 3.7 kbps.
At a 28.8 kbps modem speed, this would permit the transmission
of audio at a ratio of 1:7.78, or 10 minutes of audio in 1.3
minutes.
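
The figures quoted above follow from a simple ratio of modem speed to effective speech bit rate. The short sketch below is illustrative only; the function name and the 10-minute message length are assumptions chosen to match the example in the text.

```python
def transfer_minutes(audio_minutes, effective_kbps, modem_kbps):
    """Minutes needed to send `audio_minutes` of compressed speech."""
    # The link carries modem_kbps / effective_kbps seconds of speech per second.
    speedup = modem_kbps / effective_kbps
    return audio_minutes / speedup

# 3.7 kbps effective speech rate over a 28.8 kbps modem link.
speedup = 28.8 / 3.7
print(round(speedup, 2))                          # 7.78
print(round(transfer_minutes(10, 3.7, 28.8), 1))  # 1.3
```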

Using Texas Instruments' C80 DSP chip and a V.34
modem running at 28.8 kbps, audio can be transmitted at a
ratio of 10:1 (ten minutes of speech in 1 minute of
transmission) with telephone grade sound quality.

From the above, it is apparent that while the
transfer of data, graphics and audio messaging and content over
a network has become more widespread and convenient, this
growth has also highlighted certain historic shortcomings
associated with the transfer and input/output of voice
messaging and audio content. As voice messaging and audio
content become more available, the deficiency created by the
lack of an intermediate device or storage medium for such audio
will become more pronounced.

For both E-mail and facsimile, use of a telephone
link is limited to the transmission of the data and the
transmission of control codes for that data. With the growth
and widespread usage of network computing, the telephone link
for e-mail and facsimile (e.g. PASSaFAX from RADLinx) is
further limited to a hook-up to a local point of presence to
access the network. Both e-mail and facsimile contain content
which may be output by the intended user to a printer, which
permits the user to take a hard copy of the material with him
for review at his convenience, while he is away from his
office or traveling.

In sharp contrast, voice messages and voice-text are
currently recorded by the sender and retrieved by the intended
recipient primarily in real-time and on-line. At best, a user
can use his multimedia notebook computer to record and access
a stored audio file or streaming voice file. Off-line access
to audio is limited to downloading audio files onto a
multimedia computer and having the sound card equipped computer
play the audio. However, a multimedia computer, with its
screen, keyboard and multipurpose processing capability, is
hardly the size of a traditional dictation device or voice
recorder. This dependence on a telephone hand set or
multimedia computer to create and access audio is analogous to
requiring a recipient of a facsimile to view, edit and prepare
a facsimile only while in close proximity to a facsimile
machine or fax enabled computer. Not being able to prepare,
review and access network based voice mail other than in
real-time from a telephone hand set or off-line from a
multimedia computer severely limits the desirability of
integrating voice messaging and audio content into network
based messaging. There exist no dedicated and portable devices
to store network based voice messaging and likewise there
exists no method or utility to scan and select personal voice
messages or public announcements from a host connected to a
network for subsequent high speed transmission to a device for
subsequent off-line review by the user.

The only dedicated device which permits the user to
review his/her voice messages off-line is the Telephone
Answering Device (TAD), which is primarily a residential or
small-office, home-office (SOHO) appliance which uses digital
recording technologies to replace the standard functions of a
traditional tape-based answering machine. The TAD, plugged
into both an electrical outlet and phone jack, is not portable,
so the user must either be within hearing distance of the TAD's
speaker or, using a telephone, call in to retrieve his/her
messages on-line and in real-time. Traditionally, TADs
have offered very limited outbound messaging capabilities;
whatever outbound messaging was offered required that the owner
record any outbound message (e.g. a general greeting or
caller-specific/mail box-specific message) either from within
range of the microphone on the TAD or from a real-time
telephone call.

It is unfortunate that voice messaging, whether network based
or TAD based, is limited to on-line and real-time transmission
and physically requires access to a telephone set, TAD or
multimedia computer, particularly because voice communication
inherently does not require any external hardware or
instrumentation other than the mouth and ear for a human being
to create or access it. Speech is the most natural and
self-sufficient form of communication. Speech is hands-free,
requiring neither writing instrument, keyboard, screen,
dedicated vision nor hand-to-eye coordination on the part of the
user to input or retrieve. That voice mail is nonetheless so
widely used is more a function of speech's unique
characteristics than a vote of approval on the adequacy of the
current technology. Similarly, that so many innovative
utilities have been introduced which make audio and voice
available over public and private networks is a commentary on
the compelling nature of audio and voice for content, messaging
and issuing commands and only underscores the need to make
audio and voice more easily available. Until such time that
voice messaging and audio content are made more accessible,
many of the network based audio utilities mentioned above will
remain novelties for technophiles.

Much has been said about Computer Telephone Integration
(CTI) and the Universal Mail Box, where network based messages
and content may originate in any medium and by any input device
of choice and, likewise, may be retrieved in any medium or by
any output device of choice. Faxes can be accessed as data on
a computer screen, data can be accessed as a fax or
text-to-speech audio-text and, as automatic speech
transcription utilities become more capable, audio will be
accessed as printed text in email or fax. However, as long as
audio does not have an input/output device of choice other
than a telephone handset or screen/keyboard based multimedia
computer, its desirability as a medium of choice will likewise
be severely limited.

Since speech is a direct record of the user's voice,
the urgency, meaning and emotional content is never lost.
Similarly, since so much data is first generated in voice and
is only later transcribed to text or data, info-text should be
the preferred medium for timely data on meetings, speeches and
radio broadcasts. Ideally, voice mail should be the preferred
mode of communications when traveling, when communicating
through time-zones and when accessing timely information which
originated in the spoken word (e.g. minutes of a meeting or
lecture). Voice text (i.e. data or text which is spoken by a
computer or pre-recorded by a human) should be the preferred
format for messaging information to be accessed where use of
motor skills and vision are not convenient or are impaired such
as when driving, operating equipment or engaged in a
leisure activity.

The current use of a telephone to access voice messages
directly has significantly limited the potential utilization
of voice messaging. Real-time transmission of voice messages
and info-text makes the recording and retrieval of voice mail,
especially from long distances, very costly. The cost and
inconvenience involved means that one cannot compose and review
voice mail and info-text in a cost efficient manner and at
one's own pace. One is limited to a location and situation in
which a telephone is accessible and, in the case of a wireless
communication link, to a place where wireless transmission is
both possible and desirable.

The application of multimedia computers to compose and
review voice mail has had little effect on making voice
messaging more convenient, since the use of keyboards,
pointing devices and screens is hardly hands-free, nor is the
size and expense of a multimedia computer conducive to
widespread use and transportability. In its present state,
voice mail is limited to short messages between individuals
wishing to communicate in a more substantive fashion at
another time (telephone tag). Voice "mail" becomes limited to
voice "messaging" because of the cost and inconvenience to
both the sender and receiver of listening to lengthy,
content-rich "mail" over the phone or at a multimedia
computer. Furthermore, the cost of transmitting audio signals
in real-time, through a direct communication link to the
user's voice processor or TAD, and only when the user has
access to a telephone (as opposed to unattended recording at
off-peak hours), makes more commercial use of info-text
(recorded instructions, recorded travelogues, speech
transcripts, articles or books on "tape", etc.) and other
innovative advertiser/subscriber supported uses of voice-text
unfeasible.

Recently, U.S. Pat. No. 5,444,768, issued to Charles
Lamer et al., and assigned to International Business Machines
Corp., and U.S. Pat. No. 5,359,698, issued to Shmuel Goldberg
et al. and assigned to Espro Engineering, both disclose a
portable computer device for audible processing of audio
messages stored at one or more remote central message
facilities. The Lamer et al. system permits the user to record
and playback, transmit (upload) and receive (download) voice
messages from a central message facility and over a
communication link and onto a portable device; however, the
Lamer et al. system requires that a direct telephonic link be
established between the portable device and one or more remote
central message facilities. The Lamer et al. and Goldberg et
al. systems enable the portable device to individually access
a traditional, closed, expensive, proprietary voice processing
system through a direct communication link. The Lamer et al.
and Goldberg et al. systems do not provide a commercially
feasible solution for accessing voice mail other than by way
of a long distance call to a central message facility. The
expense associated with such a long distance toll charge would
make extended usage of the Lamer et al. system prohibitive.
In addition, the Lamer et al. system requires that a user
contact one or more remote central message facilities to
retrieve and transmit selected audio files. The inconvenience
associated with such a polling procedure nullifies the
convenience provided by the system.

Similarly, the Lamer et al. system does not provide
for a method by which the user may browse available audio
content nor for a method to select audio files from a menu for
subsequent retrieval by the portable computer device.
Similarly, the Lamer et al. system does not provide for a
utility whereby the user may remotely access a central server
linked to a network of servers to download control code or
search a personal user group or public database for an address
other than by way of initiating a dedicated "training" mode by
either coupling the portable computer device directly to a
computer or by way of detecting and recording DTMF tones
generated locally by a standard touch-tone telephone device.
Since a typical user's mail box utilities are handled on his
network e-mail server and modified regularly in the course of
his sending and receiving e-mail, such a dedicated training
session for the portable computer device is impractical.
Similarly, since new audio server platforms, utilities and
compression schemes are being introduced regularly, there is a
need for a dynamic and transparent method for updating both
control codes and address books without the need for a
dedicated training session.

Broadly, it is an object of the present invention to
provide an Internet-ready dictation and voice message
recording/reviewing device and method which enable a user to
compose and review voice mail off-line, from any location,
while engaged in any activity, at a leisurely pace, without
incurring telephone toll charges and whether a communication
link is presently accessible or not.

It is also an object of the present invention to use a
telephone link, preferably to a local network access point,
primarily as a communications link for high speed transmission
of pre-recorded material and control codes to facilitate that
transmission, thereby limiting the use of a telephone or a
multimedia computer and telephone line for voice messaging as
a recording or playback device.

It is also an object of the present invention to
provide a protocol whereby pre-message handshaking occurs
between a dictation and voice message recording/reviewing
device and a network server to conform the digitized voice
signal to one of the standard voice compression protocols and
TCP/IP protocol stacks to facilitate a high speed transmission
of voice messages over the network.
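
The pre-message handshaking described above can be pictured as a capability exchange in which device and server settle on a common compression scheme before any audio is sent. The specification does not define the messages exchanged, so the following is only a minimal sketch under stated assumptions: the codec identifiers and the `negotiate_codec` function are hypothetical names invented for illustration.

```python
# Hypothetical codec identifiers, listed in the device's order of preference.
DEVICE_CODECS = ["g723-5.3", "g723-6.3", "pcm-64"]

def negotiate_codec(device_codecs, server_codecs):
    """Return the first device-preferred codec the server also supports."""
    supported = set(server_codecs)
    for codec in device_codecs:
        if codec in supported:
            return codec
    return None  # no common codec: the transfer cannot proceed

# The server advertises its own list; the device picks the best match.
chosen = negotiate_codec(DEVICE_CODECS, ["pcm-64", "g723-6.3"])
print(chosen)  # g723-6.3
```

Keeping the preference order on the device side lets the device favor its most compact format whenever the server can accept it.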

It is another object of the present invention to
provide a portable and dedicated voice capable network
(Internet) access device which enables the user to record, edit
and play audio files which may be transmitted and/or received
over a public or private network.

It is also an object of the present invention to
provide a portable access device and method which permit the
owner of a specially modem-configured Telephone Answering
Device (M-TAD) to access and download compressed voice message
files directly from the TAD's digital memory onto a portable
voice message record/playback device, either by way of a direct
cable connection to the TAD or by a telephone link.

Providing such a portable access device and method
would permit TAD owners to encourage inbound callers to leave
more robust and data-rich audio messages on their TAD as well
as permit TAD owners to subscribe to audio content, which could
be regularly delivered to their TAD in compressed digital form
and downloaded onto the present invention for play-back and
review at a convenient time and place. This would also permit
TAD owners, while away from their home or office, to have their
portable dictation and voice message recording/reviewing device
establish a telephone link with their TAD and economically and
automatically retrieve all stored messages and update all
outgoing messages (e.g. general and caller specific greetings),
with all stored messages and outbound greetings being
transmitted in digitized and compressed format.

The invention provides a low cost, portable recording
and playback dictation and voice message recording/reviewing
device which permits the user to record, edit, play and review
voice messages including audio-text, text-to-speech and other
audio material which may be received from and subsequently
transmitted to a remote host computer located on a public or
private network over a communication link such as the public
switched telephone system.

A preferred device contains its own rechargeable
power source, integrated circuitry and control buttons to
permit the localized recording, editing, storage, playback and
transcription of audio signals through a built-in speaker,
microphone or plug-in headset, foot pedal and removable memory
card. The device also contains a standard RJ-11 telephone
jack, modem chip set (or software), or a removable PCMCIA
connector to which a standard or wireless modem card could be
connected; and a DTMF tone decoder to permit the transmission
and control of audio signals to and from a host computer
connected to a public or private network. The device contains
circuitry which permits it to transmit and receive audio
signals at a rate substantially faster than originally
recorded.

A preferred device also contains a processor which
includes the necessary terminal emulation to permit a network
user to access a network directly from a local point of access,
such as an Internet service provider's (ISP) point of access
and shell account, using a standard protocol such as SMTP
(Simple Mail Transfer Protocol), Post Office Protocol (POP3)
and MIME (Multipurpose Internet Mail Extensions) in the TCP/IP
suite to review, select and retrieve audio files that have been
sent to the user's e-mail address (or similarly, data/text
files which can be translated into voice), and to download and
transmit such files.
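
As a rough illustration of how audio files might be picked out of retrieved e-mail, the sketch below walks the MIME parts of a message and collects those with an audio content type. It uses Python's standard `email` package; the function name `audio_attachments` and the sample message are assumptions made for the example, and the raw bytes are built locally rather than fetched over POP3 so that the sketch runs without a mail server.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

def audio_attachments(raw_message: bytes):
    """Return (filename, payload bytes) for each audio/* part of a message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.get_content_maintype() == "audio":
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found

# Build a sample message locally (hypothetical content for illustration).
sample = EmailMessage()
sample["Subject"] = "Voice memo"
sample.set_content("Audio attached.")
sample.add_attachment(b"\x00\x01\x02", maintype="audio", subtype="basic",
                      filename="memo.au")

files = audio_attachments(sample.as_bytes())
```

In a device of the kind described, the same filter could be applied to each message downloaded from the user's mailbox, so that only audio parts are stored for playback.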

A preferred device also contains a standard or
touchscreen display and software which permits the user to
display a graphical editor for composing and reading
e-mail messages similar to the one displayed on his computer
screen when accessing his e-mail, so that the user can scroll
through his e-mail messages, selecting those audio files he
wishes to download and selecting text messages he wishes to
have converted, either by the network server or at the device,
into an audio format (text-to-speech).

A preferred device also contains: a cradle into which
the device may be placed, the cradle having ports which enable
it to be connected to a power source to recharge the device's
batteries; a phone jack to enable it to establish a
communication link; and a serial or parallel port on a computer
for downloading and uploading files directly to the computer
or for receiving "redirected" files.

A preferred device also contains a language user
interface capable of recognizing and responding to speech with
speech. Such an interface includes speaker independent
functions but also permits speaker adaptation which allows the
personal device to adjust to the peculiarities of the user's
voice or pronunciations and thus improve accuracy. This
speaker adaptation is achieved through a protocol which allows
the system to adapt to the user's voice through the repetition
of a set of sentences, prior to first use of the device (see
Lernout & Hauspie Speech Products' [LHSP] asr1000 product
line). The language interface includes a vocabulary builder
which permits the user to extend the vocabulary of the speech
recognition application to include special terms and proper
nouns (see LHSP Lextool™), a user template which enables
the user to create words which the device will associate with
user defined commands, e.g. "home" could be associated with an
e-mail address (LHSP asr 200 product line), alphabet
recognition for spelling an e-mail address, as well as
background noise tolerance and speech at a distance software
which improve the accuracy of the language user interface even
in an automobile, airplane or public place and even if the user
is not wearing a headset (see LHSP).

A preferred device also contains public-key 
encryption technology designed to ensure reliable and secure 
transmission of sensitive information by encrypting and 
decrypting the message data and by authenticating the sender's 
identity by using a secure digital or voice signature. 

A preferred device also contains a text-to-speech 
utility which permits the user to download data not already 
converted to speech by a network server and to do so at the 
device.

A preferred device also contains a bar code reader 
which permits the user to scan a printed bar code associated 
with printed matter such as a news article, a map, a menu of 
available audio files or in a travel guide which would give the 
device all the information it needs including network server 
address, file location and file ID so that the audio file 
associated with the printed matter could be automatically 
retrieved from a network such as the Internet. 

A preferred device also contains a bar code reader 
which permits the user to scan a printed bar code associated 
with printed matter such as a news article, a map, a menu of 
available audio files or in a travel guide which would give the 
device all the information it needs to play a file from a 
previously retrieved group of audio files (such as described 
in Goldberg et al.).

A preferred device also contains an Infrared
interface using a standard such as that of the Infrared Data
Association (IrDA) for high speed local wireless transmission
(e.g. 1.2 Mbps and 4 Mbps) of audio files and control codes
between the device and a public phone, kiosk or the user's
computer.

A preferred device also includes a software utility
called an off-line browser which programs the device to
automatically retrieve, during off-peak hours, audio files
from the network to which the user has subscribed, or from
selected Web sites which have new audio material available, or
from e-mail addresses that the user has programmed the off-line
browser to retrieve.
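
The scheduling side of such an off-line browser can be sketched as a check against an off-peak window before any source is polled. This is a minimal sketch, not the method of the specification: the window boundaries, the function names `is_off_peak` and `due_sources`, and the idea of a flat list of sources are all assumptions chosen for illustration.

```python
from datetime import time

# Assumed off-peak window; actual tariffs vary by carrier and region.
OFF_PEAK_START = time(23, 0)
OFF_PEAK_END = time(6, 0)

def is_off_peak(now, start=OFF_PEAK_START, end=OFF_PEAK_END):
    """True if `now` falls in the window, which may wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end

def due_sources(subscriptions, now):
    """Return the sources to poll, or an empty list outside the window."""
    return list(subscriptions) if is_off_peak(now) else []
```

For example, `due_sources(["subscription", "web-site", "e-mail"], time(2, 30))` returns all three sources, while the same call at midday returns an empty list and the device stays off the line.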

A preferred device also includes a software utility
which enables the user, by way of a graphical screen based
interface or by way of audio prompts, to browse network
databases, such as those located on the Internet, for addresses
and/or sites from which to receive and to which to send audio
files.

A preferred device also includes a software utility 
which creates a graphic interface and memory for the user to 
access, refresh and/or download his E-mail address book 
containing the E-mail addresses of individuals and groups for 
which he may wish to prepare and to which he may wish to send 
audio files. Such a utility would automatically synchronize 
the data in the dictation and voice message recording/reviewing 
device to the data contained in the user's E-mail server 
account.

A preferred device also includes a software utility 
which creates a graphic interface and memory for the user to 
organize his/her telephone numbers, E-mail addresses, calendar, 
reminders and appointments including a clock and alarm function 
with an option to choose between a simple audible sound alarm 
or a programmed voice message alarm (e.g. "call home").

A preferred device also includes a software utility
which enables the user to download proprietary client server
software systems and upgrades and newly introduced standards
for low bit rate speech compression made available over a
public or private network such as the Internet to ensure that
the device may use the latest state-of-the-art audio
compression software.

A preferred device also includes a software utility
which enables the user to download proprietary client server
software systems and upgrades and newly introduced standards
which enable the device to receive highly compressed and/or
streaming audio files containing voice content including, but
not limited to, application program interfaces (APIs) which
enable the device to be used as a portable Internet Phone
appliance to conduct a real-time, two-way, full-duplex voice
conversation using a local connection to the Internet.

A preferred device also includes a software utility
which extends the functionality of a Web program run from a Web
browser and operates on data, such as audio data, as it flows
into the user's PC, permitting the user to redirect audio files
by way of the communication port directly to the device seated
in a cradle and connected to the serial or parallel port.
Alternatively, this could be achieved through OLE (Object
Linking and Embedding) enabled Web software which, when
activated by the user by pressing a designated key such as
print, redirects audio files directly to a special "printer"
driver dedicated for the device. The utility permits users who
are browsing the Web on their computers to download audio files
directly to their personal audio servers for later access,
without having to transfer from their hard disc.

A preferred device also includes a software utility
which enables the user to select E-mail messages and request
that the messages be converted from text-to-speech by an
appropriate text-to-speech conversion application available to
the network, and only subsequently digitized and transmitted
as a digitized and compressed audio file.

The invention also relates to a method and software
utility using DSVD (Digital Simultaneous Voice/Data) and/or the
VoiceView protocols (Radish Communications Systems, Inc.) which
would enable the user, once connected to a communication link,
to transfer and receive audio files directly into
a dictation and voice message recorder device simultaneously
or alternately with the user processing and/or receiving
and transmitting other related or unrelated data to and from
the network or, conversely, while the user is talking on the
phone. The use of these voice/data protocols would permit the
dictation and voice message recording/reviewing device user to
request audio files in response to voice prompts spoken in
digitized streaming or analog voice, to respond by spoken
responses, keypad entries or DTMF tones and to transfer those
files in high speed data mode during the same phone connection.

The invention also relates to a method and software 
utility which permit the scalability of digitized audio files 
in order to conform with network server requirements and/or
user preferences. This would enable the server to demand or 
the user to request a lower compression rate or slower 
transmission speed in order to have higher fidelity for the 
audio file requested, and vice versa. 
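
The trade-off described above, in which a lower compression rate buys higher fidelity at the cost of transmission time, can be sketched as choosing the highest bit rate that both the user and the server will accept. The sketch is illustrative only: the 5.3 and 6.3 kbps entries are the G.723 rates cited in the Background, while the higher rates, the function name `select_bit_rate` and the fallback rule are assumptions.

```python
# Bit rates in kbps; 5.3 and 6.3 come from the Background section,
# the higher-rate entries are placeholders added for illustration.
AVAILABLE_KBPS = [5.3, 6.3, 16.0, 32.0]

def select_bit_rate(preferred_kbps, server_max_kbps, rates=AVAILABLE_KBPS):
    """Pick the highest available rate within both user and server limits."""
    limit = min(preferred_kbps, server_max_kbps)
    candidates = [r for r in rates if r <= limit]
    # Fall back to the most compressed format if nothing fits the limit.
    return max(candidates) if candidates else min(rates)
```

A user asking for 32 kbps from a server that caps transfers at 16 kbps would receive 16 kbps audio, while a user who prefers fast transfers can request a low rate and accept reduced fidelity.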

It is a feature of the present invention that a 
recording device may be left connected to a communication link 
and programmed to dial into and to connect to a local network 
access point at off-peak hours when telephone rates are lowest 
and when excess capacity on incoming lines is available. The 
recording device is programmed to search the network for audio 
files to which the user has a subscription, new audio files 
available from Web sites to which the user has programmed the 
device to look, and for audio mail sent to the user from 
selected E-mail addresses. 

It is a feature of the present invention that an 
interface port, such as a standard RJ-11 telephone jack, is 
provided so that the recording device may be connected between 
a telephone set, computer, cellular phone or personal digital 
assistant and a communication link to enable the user to select 
and retrieve voice files while using any of the above devices. 

It is also a feature of the present invention that 
circuitry is provided for the digital conversion and 
compression of the analog voice signals recorded in the memory 
of a dictation and voice message recording/reviewing device to 
permit high density storage and high speed transmission of 
digitized voice. Similarly, circuitry is provided for the
analog conversion and natural sounding playback of previously
stored or received digitized voice.

It is also a feature of the present invention that
there may be provided a public terminal, e.g. in a manner
similar to an automated teller machine and located at places
such as airports and tourist sites, where a user could connect
his recording/reviewing device and select voice messages and
audio-text to be retrieved and transmitted directly by the
recording/reviewing device.

Brief Description of the Drawing 

The foregoing, as well as the other objects, features
and advantages of the present invention will be understood more
completely from the following detailed description of a
preferred embodiment, with reference being had to the
accompanying drawing, in which:

Figure 1 is a schematic block diagram of a preferred 
personal audio message processor embodying the present 
invention; and 

Figures 2-7 (Figure 2 comprises Figures 2a and 2b) 
are flow charts illustrating how certain processing is 
performed in the apparatus of Fig. 1. 

Detailed Description 

Figure 1 is a schematic block diagram of a presently
preferred Personal Voice Server (PVS) system 10 embodying the
present invention. PVS system 10 broadly comprises five main
parts: a highly integrated DSP/RISC integrated chip 11 (DSP
stands for Digital Signal Processor and RISC stands for Reduced
Instruction Set Computer); a Telecom/Audio Codec 17; a memory
such as SDRAM 12 and/or Flash Memory 13 coupled to the DSP
chip; peripherals such as a microphone 26, a speaker 18, a
touchscreen/display LCD 19, an infrared I/O 21 and a barcode
reader 15. Operating system software is also provided to
manage the DSP to handle modem routines such as V.32bis, V.34,
etc., voice recognition, echo cancellation and speech
synthesis; software also controls the system via the RISC part
of chip 11. Although the embodying device 10 is referred to
as a voice server, it should be clear that it is equally useful
for other types of audio, including music.

The DSP chip is preferably a Philips Semiconductor
PR31100 chip, which contains a MIPS R3000 RISC CPU core with
4 Kbytes of instruction cache and 1 Kbyte of data cache, plus
various integrated functions for interfacing to numerous system
components and external I/O modules. The chip also has a
hardware multiply/accumulate unit to perform DSP functions,
such as a software fax/modem, which eliminates the need for an
external modem chip set. However, the chip also has a UART
(Universal Asynchronous Receiver/Transmitter) interface 22 (shown
separately), which permits the device to be connected to an
external modem or other device (such as a modem-equipped
Telephone Answering Device) through a conventional RS232 serial
connector 23.

The PR31100 also contains multiple DMA (direct memory 
access) channels and a high-performance, flexible Bus Interface 
Unit (BIU) for providing an efficient means for transferring 
data between external system memory, cache memory, the CPU 
core, and external I/O modules. The PR31100 also contains a 
System Interface Module (SIM), which provides integrated
functions for interfacing to various external I/O modules, such 
as a liquid crystal display (LCD) 19, an infrared I/O module 
21, and the Codec 17. 

Codec 17 is preferably a Philips UCB1100 single chip
integrated mixed signal audio and telecom codec, which handles
most of the analog functions of the system, including the sound
and telecommunications codec (analog/digital coding and
decoding) functions and touchscreen analog-to-digital
conversion, ISDN/high-speed serial, infrared, and wireless
peripherals. The high-speed serial interface 14, although
shown separately in Fig. 1, is actually part of the UCB1100.
The chip has a single channel audio codec which is designed for 
direct connection of a microphone and speaker (i.e. components 



WO 98/47252 PCT/US98/07228

16 and 28 are actually part of the UCB1100). The built-in
telecommunications codec can be connected directly to a 
conventional RJ-11 jack 20 for connections to a telephone line. 

For a more complete understanding of the embodiment
of Fig. 1, data sheets for the PR31100 and UCB1100 are attached
and are incorporated in this description by reference.

The operating system software for the PR31100 is
preferably Eden OS version 2.0, commercially available from the
Eden Group Limited of Cheshire, England. This operating system
is specifically designed to support the PR31100 (also known as
DINO) and the UCB1100 (also known as BETTY). A data sheet for
the Eden OS is attached, which describes the software support
and the drivers provided by the operating system. This data
sheet is incorporated in the present description by reference.

Memory 12, 13 is used to store messages and to hold
temporary data. The flash memory is configured according to
the amount of permanent programs required, including operating
system (O/S) and application software, and also to store some
of the recorded messages. Typically, audio compression
provided in the PR31100 will result in a data bandwidth of less
than half a Kbyte per second (i.e. 1 Mbyte of memory will
provide an hour of audio).
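
The storage figure above can be checked with a short calculation. The sketch below assumes a binary megabyte (1,048,576 bytes); the 2.5 kbps rate is the example rate mentioned later in this description, and the function names are illustrative only.

```python
SECONDS_PER_HOUR = 3600
MBYTE = 1024 * 1024  # assumption: binary megabyte


def bytes_per_second_for(capacity_bytes, duration_s):
    """Average data rate needed to fit `duration_s` of audio in the capacity."""
    return capacity_bytes / duration_s


def minutes_of_audio(capacity_bytes, bit_rate_bps):
    """Playing time, in minutes, that fits in the capacity at a given bit rate."""
    return capacity_bytes * 8 / bit_rate_bps / 60


rate = bytes_per_second_for(MBYTE, SECONDS_PER_HOUR)
print(f"{rate:.0f} bytes/s ({rate * 8 / 1000:.2f} kbit/s) for one hour per Mbyte")
print(f"{minutes_of_audio(MBYTE, 2500):.1f} minutes per Mbyte at 2.5 kbit/s")
```

One hour per Mbyte works out to roughly 290 bytes per second, which is consistent with the stated bound of less than half a Kbyte per second.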

A microphone 26 and speaker 18 are selected based on 
quality and size. 

Flow diagrams are presented in Figs. 2 - 7 to
describe the operation of retrieving messages over the Internet
and transmitting them to and from the PVS, as well as the
various operational options for dialing, receiving data from
a given server address in the Internet, storing, screening,
retrieving, transmitting and playing messages to/from the PVS.
These operations include receiving compressed messages in
digital form and audio signals in analog form bi-directionally
from speaker/microphone and phone connection.

Figures 2a and 2b comprise a flow chart illustrating
how the PVS connects to a location on the Internet by Transport
Protocol and how the PVS gets all data relating to its
Web/e-mail site (e.g. HTML language displaying information) and
receives/stores messages (audio, data etc.) that were sent
using either a proprietary or de-facto standard (e.g. highly
compressed audio at 2.5 kbps).

The operation depicted in Figs. 2a and 2b is run
concurrently by the real-time kernel of the DSP/RISC (discussed
further below with reference to Fig. 3). It enables multiple
tasks to be run and executed in parallel. Operation of the
main task begins at block 200. Accessing a site and storing
or receiving stored messages is run concurrently with other
tasks. These tasks can be local to operate the PVS, or other
tasks such as the operation of the bar-code reader, voice
synthesizer, voice recognition, or to access other Web sites
by PPP at the same time.

At block 202, a test is performed to determine
whether the desired operation is connection to a network access
provider via an outbound call (at block 210). If not, the
modem, in response to a ring, answers the call, completes its
handshake procedure, and begins receiving information (block
204). Data bits from the modem are received by DSP chip 11 at
block 220. The DSP chip decodes the incoming data at block
230.

At block 240, a test is performed to determine
whether the desired operation is to decode an HTML site. If
not, control transfers to block 340. Otherwise operation
continues at block 250, where the display of the site page
begins. A test is performed at block 260 to determine whether
the mode of operation is interactive or automatic. In the
interactive mode, the user of the PVS has to browse and select
the desired operation to be completed. In automatic mode, the
keyword(s) to retrieve audio or other messages are searched for
and activated automatically to get the compressed data. If the
test at block 260 senses the interactive mode, control is
transferred to block 110 in Fig. 2b. If not, automatic
browsing is done starting at block 270 to search for a
highlighted keyword symbol. At block 280, a test is performed
to determine whether the keyword constitutes a request for a
previously digitized message and, if so, the compressed data
is received by the PVS via FTP at block 290. If the test
at block 280 results in a "no", control transfers to block
310.

At block 310 a test is performed to determine whether
no more messages exist and, if so, control returns to block
100. Otherwise, a test occurs at block 320 to determine if the
keyword constitutes a request for a place to store local
messages at the web server. If so, this data, such as
compressed audio messages, is transmitted from the PVS to the
web site (block 330). If not, control returns to the start
(block 100). The process is continued until there are no other
stored messages for the PVS owner at this Web site.

At block 340, a test is performed to determine
whether this site is utilizing the FTP protocol. If
so, a message is retrieved utilizing FTP (block 360), and it
is stored at block 380 and control is transferred to block 120
in Fig. 2b. If it is determined at block 340 that the FTP protocol
is not being used, a test is performed at block 350 to
determine whether or not a recognized access language is being
received. If so, a message is retrieved at block 360 utilizing
the recognized access language and is then stored at block 380.
Control is then transferred to block 120 in Fig. 2b. If a
recognized access language is not found at block 350, the user
is notified at block 370 and control returns to block 100.
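
The decisions at blocks 340 through 380 amount to a small dispatch on the access method in use. The sketch below is illustrative only: the function name, the string labels and the example language name are assumptions, and only the block numbers come from the flow chart description.

```python
def retrieve_message(protocol, recognized_languages):
    """Return (actions, next_block) for the tests at blocks 340 and 350."""
    if protocol == "FTP":                       # test at block 340
        return (["retrieve (360)", "store (380)"], 120)
    if protocol in recognized_languages:        # test at block 350
        return (["retrieve (360)", "store (380)"], 120)
    return (["notify user (370)"], 100)         # nothing recognized


# Example: a site using a hypothetical access language named "X-AUDIO".
print(retrieve_message("X-AUDIO", {"X-AUDIO"}))
print(retrieve_message("UNKNOWN", {"X-AUDIO"}))
```

Both successful paths converge on block 120 in Fig. 2b, while the failure path returns to the start at block 100.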

If it was determined at block 260 that the mode is
interactive, control is transferred to block 110 in Fig. 2b.
At block 112, the keywords in the web page are selected and,
at block 114, HTML interpretation is activated to locate the
messages in the pool. At block 116, messages are then sent
and/or received and control is returned to block 100 in Fig.
2a.

Following block 380, where data was stored,
preferably in compressed form, control is transferred to block
120 in Fig. 2b. Any data which is stored causes the creation
of data in a flat database (block 120), which may be searched
to locate the data at a later time. In case the message is an
audio message, it is decompressed and played at the same time
that it is transmitted by FTP protocol. The test at block 122
determines whether such action is necessary for the current
message and, if so, decompression and the audio synthesizer are
activated (block 124), the database is updated to reflect that
the message is ready to be synthesized, and control is returned
to block 100. If the message is not to be decompressed and
played, block 122 transfers control to block 128, where a test
is performed to determine if the message is to be sent to the
web server and, if not, control is returned to block 100. If
the message is to be sent to the web server, it is sent by FTP
at block 130, and the user is notified upon completion of the
transfer (block 132), after which control returns to block 100.
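
A flat database of the kind created at block 120 can be pictured as a simple list of records that is scanned on each search. The sketch below is one possible illustration; the record fields and the class names `MessageRecord` and `FlatDatabase` are assumptions, since the description does not specify the database layout.

```python
from dataclasses import dataclass, field


@dataclass
class MessageRecord:
    message_id: int
    kind: str                    # e.g. "audio" or "data"
    source: str                  # where the message came from
    compressed: bool = True      # data is preferably stored compressed
    ready_to_play: bool = False  # set once decompression is activated


@dataclass
class FlatDatabase:
    records: list = field(default_factory=list)

    def add(self, record):
        """Create an entry for newly stored data (block 120)."""
        self.records.append(record)

    def search(self, **criteria):
        """Linear scan returning records whose fields match all criteria."""
        return [r for r in self.records
                if all(getattr(r, k) == v for k, v in criteria.items())]


db = FlatDatabase()
db.add(MessageRecord(1, "audio", "web"))
db.add(MessageRecord(2, "data", "mail"))
print([r.message_id for r in db.search(kind="audio")])
```

A linear scan is adequate for the small number of messages a handheld device would hold, which is presumably why a flat (unindexed) structure suffices here.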

Figure 3 describes the overall operation of the
kernel of the Eden OS as run on the RISC core CPU of the DSP
11 for the present application. The kernel is multitasking,
in that it can run multiple programs or tasks concurrently,
with each one having its own priority and being capable of
initiating other (child) tasks. After the kernel initializes
via blocks 400-420, operation starts in the idle mode at block
480, where the PVS waits for events to occur, and when one
occurs it is handled at block 430. Every program interacts
with the operating system this way, by having its tasks
attended to at block 430. The types of events that arise are
either synchronous or asynchronous. At block 440, if a
synchronous event is detected, processing of the synchronous
events is initiated via connector 5. Otherwise, a test is
performed at block 450 to detect an asynchronous event, in
which case processing of the asynchronous events is initiated
via connector 6. In each case, after processing is initiated
the operating system returns to the idle mode to process other
events. Another special event is error handling at
block 460. In the event that an asynchronous event is not
detected at block 450, a test is performed at block 460 to
detect a failure event and, if there is none, the program
returns to the idle mode. In the event of a hardware failure,
a communications failure or a software failure, an error event
is detected at block 460 and a run time handler is invoked
(block 470) and handles the event. Control then returns to the
idle mode. The synchronous and asynchronous events identified
in Fig. 3 are only exemplary and it is contemplated that there
may be others of each type.
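
The idle/dispatch cycle of Fig. 3 can be sketched as a simple event loop. This is not the Eden OS implementation; the `Kernel` class, the event kind labels and the log are assumptions made for illustration, and task priorities and child tasks are deliberately left out.

```python
from collections import deque


class Kernel:
    def __init__(self):
        self.events = deque()  # pending (kind, payload) events
        self.log = []          # record of what was dispatched

    def post(self, kind, payload=None):
        """Queue an event for the kernel to handle."""
        self.events.append((kind, payload))

    def run_until_idle(self):
        """Handle queued events (block 430), then report the idle mode (block 480)."""
        while self.events:
            kind, payload = self.events.popleft()
            if kind == "sync":        # test at block 440
                self.log.append(("sync", payload))
            elif kind == "async":     # test at block 450
                self.log.append(("async", payload))
            elif kind == "error":     # test at block 460, handler at block 470
                self.log.append(("handled_error", payload))
            # any other kind is ignored and the loop falls back toward idle
        return "idle"


k = Kernel()
k.post("async", "uart")
k.post("error", "comm")
k.post("sync", "timer")
print(k.run_until_idle(), k.log)
```

The order of the three tests in the loop follows the order of blocks 440, 450 and 460 in the description.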

Figure 4 is a block diagram illustrating the routine
performed by the controller of DSP/RISC Chip 11 when an analog
audio message is to be recorded. At block 710, a test is
performed to determine whether the incoming messages are from
the built-in microphone. If not, control is transferred to the
routine of Fig. 5. If so, the audio message is digitized and
compressed (block 720) and placed in the working pool of data
(block 730). At block 740, a test is performed to determine
whether memory was filled before an entire message was stored.
If not, the routine is terminated, and control returns to the
idle mode. If so, recording is disabled (block 750), and the
operator is notified, as by a warning light, that the memory is
full (block 760). Control reverts to the idle mode.
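
The memory-full handling of Fig. 4 can be sketched as below. The chunks are assumed to be already digitized and compressed (block 720); the function name, the byte-string representation and the returned status dictionary are illustrative assumptions only.

```python
def record_from_microphone(chunks, pool, capacity):
    """Store compressed chunks in the pool until done or memory fills."""
    used = sum(len(c) for c in pool)
    for chunk in chunks:
        if used + len(chunk) > capacity:        # test at block 740
            # blocks 750 and 760: disable recording and warn the operator
            return {"recording_enabled": False, "memory_full_warning": True}
        pool.append(chunk)                      # block 730
        used += len(chunk)
    return {"recording_enabled": True, "memory_full_warning": False}


pool = []
print(record_from_microphone([b"abc", b"def"], pool, capacity=4), pool)
```

In this sketch the part of the message that fit before memory filled remains in the pool; the description does not say whether a truncated message is kept or discarded.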

Figure 5 is a block diagram illustrating the routine
performed to record analog audio from the telephone line. At
block 800, a test is performed to determine whether an audio
message being received is from the communications link
(telephone line). If not, control is transferred to the routine
of Fig. 6. If so, the message is passed through the
Telecom/Audio Codec 17 as audio (block 810), and a test is
performed at block 820 to determine whether compression is to
be performed by the DSP/RISC Chip. If so, the message is
stored in local memory (block 830), recording is stopped, and
control is returned to the idle mode. If compression is not
to be performed by the DSP/RISC Chip, the message is sent to
the Telecom/Audio Codec, which compresses it by a standard
(ADPCM) algorithm (block 840). The message is then sent back
to the DSP/RISC 11 through its UART (block 850), and the
DSP/RISC chip controller causes the message to be stored in
flash memory 13 (block 860). Control is then returned to the
idle mode.

Figure 6 is a block diagram of the routine performed
by the Audio/Telecom Codec controller to play stored audio
through the built-in speaker. At block 900, the operator
selects a message from the pool of messages stored in the
device. At block 910, a test is performed to determine whether
the stored message to be read was originally compressed by the
audio/telecom codec. If not, control is transferred to block
920. If so, the message is read and decompressed using the
audio/telecom codec (block 930), and the decompressed message
is applied to the digital-to-analog converter (DAC) in the
audio/telecom codec (block 940). The message is then played via
the built-in speaker 18 through the D/A converter and amplifier
28 (block 950), and control is returned to the idle mode.

If the stored message was not originally compressed
by the audio/telecom codec, a test is performed at block 920
to determine whether the stored message was originally
compressed by the DSP/RISC chip. If not, the user is
notified (block 960), and control is returned to the idle mode.
If so, the message is read by the controller (block 970), and
it is then sent to the modem to be decompressed and then
returned from the modem to memory 13 through the UART port of
the audio/telecom codec 17 (block 980). Control is transferred
to block 940, and playback is handled in the same manner as a
message originally compressed by the audio/telecom codec.

Figure 7 is a schematic illustration of how the PVS,
connected to its cradle, may be connected to a PC (whether
multimedia or not) or to a specially configured TAD with a
built-in modem in order to permit a PC or TAD user (A) to send
or receive a voice file from or to the PVS through a modem
other than the telecom/audio codec of the PVS. This would
permit a PC user to send or attach a voice file resident in the
PVS over the PC's modem and would likewise permit the PC user
to download a voice file received over the PC's modem directly
to the PVS. The same configuration would permit a
non-multimedia PC user (B) to play audio files by using the PVS's
multimedia capabilities to play audio files received over the
non-multimedia PC's modem. This configuration would likewise
permit the PC user (C) to record audio through the PVS's
built-in microphone and transmit it through the PC's modem as files
or streaming audio. Such a configuration would also permit the
user of a PC (D) to redirect audio files directly to the PVS
while using a standard Web browser program. Finally, a similar
configuration with a modem-configured TAD would permit the TAD
user to download audio messages to and from the TAD to the PVS.

Bi-directional communication from the PC to the PVS 
is handled by a communication cable (e.g. 9 pin connector) at 
the PC and the serial RS232 port on the PVS and controlled by
the asynchronous event software controlling input/output from 
the UART communication interface. 

The software at the PC handles the driver for 
sending/receiving data to/from the PC to the PVS. For sending 
data, this would be similar to a PC sending data to a fax or 
printer, and for receiving data, this is similar to a PC 
receiving data from a scanner. This driver sets all required 
parameters for the PVS such as type of operation, length and 
wait for acknowledgment and "End of Transmission". The PC also
handles the software to use the PVS as an attachment 
(peripheral) for receiving multimedia audio messages so that 
the speaker on the PVS will operate. The PC also handles the 
software to manage the microphone input of the PVS, and 
software to integrate with a standard Web browser (e.g.
Netscape Navigator) to be fully integrated with the software 
and invoke commands to the PVS accordingly. 

The software in the PVS is part of the multitasking 
operating functions to handle Remote activation of Procedural 
Calls (RPC) controlled under the asynchronous events software 
of the PVS. 

Although preferred embodiments of the invention have 
been disclosed for illustrative purposes, those skilled in the 
art will appreciate that many additions, modifications and 
substitutions are possible without departing from the scope and 
spirit of the invention as defined in the accompanying claims.

Highly integrated embedded processor



PHILIPS
PR31100 



Version 1.2 
GENERAL DESCRIPTION 

PR31100 Processor is a single-chip, low-cost, integrated embedded
processor consisting of MIPS R3000 core and system support logic
to interface with various types of devices.

PR31100 consists of a MIPS R3000 RISC CPU with 4 KBytes of
instruction cache memory and 1 KByte of data cache memory, plus
integrated functions for interfacing to numerous system components
and external I/O modules. The R3000 RISC CPU is also augmented
with a multiply/accumulate module to allow integrated DSP
functions, such as a software modem for high-performance standard
data and fax protocols. PR31100 also contains multiple DMA
channels and a high-performance and flexible Bus Interface Unit
(BIU) for providing an efficient means for transferring data between
external system memory, cache memory, the CPU core, and
external I/O modules. The types of external memory devices
supported include dynamic random access memory (DRAM),
synchronous dynamic random access memory (SDRAM), static
random access memory (SRAM), Flash memory, read-only memory
(ROM), and expansion cards (PCMCIA and/or MagicCard).
PR31100 also contains a System Interface Module (SIM) containing
integrated functions for interfacing to numerous external I/O
modules such as liquid crystal displays (LCDs), the UCB1100 (which
handles most of the analog functions of the system, including sound
and telecom codecs and touchscreen ADC), ISDN/high-speed
serial, infrared, wireless peripherals, Magicbus, etc. Lastly, PR31100
contains support for implementation of power management,
whereby various PR31100 internal modules and external
subsystems can be individually (under software control) powered up
and down.

Figure 1 shows an External Block Diagram of PR31100. 



FEATURES 

• 32-bit R3000 RISC static CMOS CPU
• 4 KByte instruction cache
• 1 KByte data cache
• Multiply/accumulator

• On-chip peripherals with individual power-down

- Multi-channel DMA controller

- Bus interface unit

- Memory controller for ROM, Flash, RAM, DRAM, SDRAM,
SRAM, and PCMCIA and/or MagicCard

- Power management module

- Video module

- Real-time clock 32.768 kHz reference

- High-speed serial interface

- Infrared module

- Dual-UART

- SPI bus

• 3.3V supply voltage

• 208-pin LQFP (Low profile quad flat pack)
• 40MHz operation frequency



ORDERING INFORMATION 



PART NUMBER   TEMPERATURE RANGE (°C) AND PACKAGE             FREQUENCY (MHz)   DRAWING NUMBER
PR31100ABC    0 to +70, 208-pin Low Profile Quad Flat Pack   40                LQFP208



1996 Aug 07

OVERVIEW

Each of the on-chip peripherals consists of:

BIU Module

• System memory and PR31100 Bus Interface Unit (BIU)

- supports up to 2 banks of physical memory

- supports self-refreshing DRAM and SDRAM

- programmable parameters for each bank of DRAM or SDRAM
(row/column address configuration, refresh, burst modes, etc.)

• programmable chip select memory access

- 4 programmable (size, wait states, burst mode control) memory
device and general purpose chip selects

available for system ROM, SRAM, Flash
available for external port expansion registers

- 4 programmable (wait states, burst mode control) MagicCard or
general purpose chip selects

available for (future) MagicCard expansion memory
PR31100 provides the chip select and card detect signals
supports card insertion/removal timeouts
MagicCard requires minimal number of unique control/status
signals per port

• supports up to 2 identical full PCMCIA ports

- PR31100 and UCB1100 provide the control signals and accept
the status signals which conform to the PCMCIA version 2.01
standard

- appropriate connector keying and level-shifting buffers required
for 3.3V versus 5V PCMCIA interface implementations

SIU Module

• multi-channel 32-bit DMA controller and System Interface Unit
(SIU)

• independent DMA channels for video, Magicbus, SIB to/from
UCB1100 audio/telecom codecs, high-speed serial port, IR UART,
and general purpose UART

• address decoding for submodules within System Interface Module
(SIM)

CPU Module

• R3000 RISC central processing unit core

- full 32-bit operation (registers, instructions, addresses)

- 32 general purpose 32-bit registers; 32-bit program counter

- MIPS RISC Instruction Set Architecture (ISA) supported

• on-chip cache

- 4 KByte direct-mapped instruction cache (I-cache)
physical address tag and valid bit per cache line
programmable burst size
instruction streaming mode supported

- 1 KByte data cache (D-cache)
physical address tag and valid bit per cache line
programmable burst size
write-through

- cache address snoop mode supported for DMA

- 4-level deep write buffer

• programmable memory protection

- separate read and write protection control for kernel and user
space

- 8 total protectable regions available, each individually
programmable, using breakpoint address, mask, control, and
status registers

- causes address exception on illegal reads or writes

• high-speed multiplier/accumulator

- on-chip hardware multiplier

- supports 16x16 or 32x32 multiplier operations, with 64-bit
accumulator

- existing multiply instructions are enhanced and new multiply
and add instructions are added to R3000 instruction set to
improve the performance of DSP applications

• CPU interface

- handles data bus, address bus, and control interface between
CPU core and rest of PR31100 logic

Clock Module

• PR31100 supports system-wide single crystal configuration,
besides the 32 KHz RTC XTAL (reduces cost, power, and board
space)

• common crystal rate divided to generate clock for CPU, video,
sound, telecom, UARTs, etc.

• external system crystal rate is vendor-dependent

• independent enabling or disabling of individual clocks under
software control, for power management

CHI Module

• high-speed serial Concentration Highway Interface (CHI) contains
logic for interfacing to external full-duplex serial
time-division-multiplexed (TDM) communication peripherals

• supports ISDN line interface chips and other PCM/TDM serial
devices

• CHI interface is programmable (number of channels, frame rate,
bit rate, etc.) to provide support for a variety of formats

• supports data rates up to 4.096 Mbps

• independent DMA support for CHI receive and transmit

Interrupt Module

• contains logic for individually enabling, reading, and clearing all
PR31100 interrupt sources

• interrupts generated from internal PR31100 modules or from edge
transitions on external signal pins

IO Module

• contains support for reading and writing the 7 bi-directional
general purpose IO pins and the 32 bi-directional multi-function
IO pins

• each IO port can generate a separate positive and negative edge
interrupt

• independently configurable IO ports allow PR31100 to support a
flexible and wide range of system applications and configurations

IR Module

• IR consumer mode

- allows control of consumer electronic devices such as stereos,
TVs, VCRs, etc.

- programmable pulse parameters

- external analog LED circuitry

• IRDA communication mode

- allows communication with other IRDA devices such as FAX
machines, copiers, printers, etc.

- supported by UART module within PR31100

- external analog receiver preamp and LED circuitry

- data rate = up to 115 Kbps at 1 meter

• IR FSK communication mode

- supported by UART module within PR31100

- external analog IR chip(s) perform frequency modulation to
generate the desired IR communication mode protocol

- data rate = up to 36000 bps at 3 meters

• carrier detect state machine

- periodically enables IR receiver to check if a valid carrier is
present

Magicbus Module

• synchronous, serial 2-wire (clock and data), half-duplex
communications protocol

• supports low-cost, low-power peripherals

• supports maximum data rate of 14.75 Mbps

• DMA support for Magicbus receive and transmit

Power Module

• power-down modes for individual internal peripheral modules

• serial (SPI port) power supply control interface supported

• power management state machine has 4 states: RUNNING,
DOZING, SLEEP, and COMA

Serial Interconnect Bus (SIB) Module

• PR31100 contains holding and shift registers to support the serial
interface to the UCB1100 and/or other optional codec devices

• interface compatible with slave mode 3 of Crystal CS4216 codec

• synchronous, frame-based protocol

• PR31100 always master source of clock and frame frequency and
phase; programmable clock frequency

• each SIB frame consists of 128 clock cycles, further divided into 2
subframes or words of 64 bits each (supports up to 2 devices
simultaneously)

• independent DMA support for audio receive and transmit, telecom
receive and transmit

• supports 8-bit or 16-bit mono telecom formats

• supports 8-bit or 16-bit mono or stereo audio formats

• independently programmable audio and telecom sample rates

• CPU read/write registers for subframe control and status

System Peripheral Interface (SPI) Module

• provides interface to SPI peripherals and devices

• full-duplex, synchronous serial data transfers (data in, data out,
and clock signals)

• PR31100 supplies dedicated chip select and interrupt for an SPI
interface serial power supply

• 8-bit or 16-bit data word lengths for the SPI interface

• programmable SPI baud rate

Timer Module 

• Real Time Clock (RTC) and Timer

• 40-bit counter (30.517 µsec granularity);
maximum uninterrupted time = 388.36 days

• 40-bit alarm register (30.517 µsec granularity)

• 16-bit periodic timer (0.868 µsec granularity);
maximum timeout = 56.8 msec

• interrupts on alarm, timer, and prior to RTC roll-over
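
The timer figures listed above are mutually consistent, as the following check suggests. It assumes the RTC runs from a standard 32.768 kHz crystal, which matches the listed 30.517 µsec granularity; the variable names are illustrative only.

```python
RTC_HZ = 32_768                      # assumption: 32.768 kHz RTC reference
SECONDS_PER_DAY = 86_400

rtc_tick_us = 1e6 / RTC_HZ                          # one counter step
rtc_span_days = (2 ** 40) / RTC_HZ / SECONDS_PER_DAY  # 40-bit roll-over time

periodic_tick_us = 0.868             # granularity as listed in the data sheet
periodic_max_ms = (2 ** 16) * periodic_tick_us / 1000  # 16-bit maximum timeout

print(f"RTC tick: {rtc_tick_us:.3f} usec, roll-over after {rtc_span_days:.2f} days")
print(f"Periodic timer maximum: {periodic_max_ms:.1f} msec")
```

The 40-bit counter at this rate rolls over after 2^25 seconds, which is where the figure of roughly 388 days comes from.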

UART Module

• 2 independent full-duplex UARTs

• programmable baud rate generator

• UART-A port used for serial control interface to external IR
module

• UART-B port used for general purpose serial control interface

• UART-A and UART-B DMA support for receive and transmit

Video Module

• bit-mapped graphics

• supports monochrome, grey scale, or color modes

• time-based dithering algorithm for grey scale and color modes

• supports multiple screen sizes

• supports split and non-split displays

• variable size and relocatable video buffer

• DMA support for fetching image data from video buffer

Figure 2 shows a typical system block diagram consisting of
PR31100 and UCB1100 for a total system solution.

Version 1.2
GENERAL DESCRIPTION 

The UCB1100 is a single chip, integrated mixed signal audio and
telecom codec. The single channel audio codec is designed for
direct connection of a microphone and speaker. The built-in telecom
codec can directly be connected to a DAA and supports high speed
modem protocols. The incorporated 10 bit analogue to digital
converter and the touch screen interface provide complete control
and readout of a connected 4 wire resistive touch screen. The 10
additional general purpose I/O pins provide programmable inputs
and/or outputs to the system.

The UCB1100 has a serial interface bus (SIB) intended to
communicate to the system controller. Both the codec input and
output data and the control register data are multiplexed on this SIB
interface.

APPLICATIONS 

• Personal Intelligent Communicators (PIC)/ 
Personal Digital Assistants (PDA) 

• Screen phones 

• Smart Phone and smart Fax 

• Intelligent Communicators 

KEY FEATURES 

• 48-pin LQFP (SOT313-2) small body SMD package and low 
external component count result in minimal PCB space 
requirement. 

• A 12-bit sigma delta audio codec with programmable sample rate, 
input and output voltage levels, capable of connecting directly to 
speaker and microphone, including digitally controlled mute, 
loopback and clip detection functions. 

• A 14-bit sigma delta telecom codec with programmable sample 
rate, including digitally controlled input voltage level, mute, 
loopback and clip detection functions. The telecom codec is 
intended for direct connection to a DAA (digital access 
arrangement) and includes a built-in sidetone suppression circuit. 

• A complete 4 wire resistive touch screen interface circuit 
supporting position, pressure and plate resistance measurements. 

• A 10-bit successive approximation ADC with internal track and 
hold circuit and analogue multiplexer for touch screen readout and 
monitoring of four external high voltage (7.5V) analogue voltages. 

• A high speed, 4 wire serial interface data bus (SIB) for 
communication to system controller. 

• A 3.3V supply voltage and built-in power saving modes make the 
UCB1100 optimal for portable and battery powered applications. 
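
The 10-bit successive approximation conversion named in the feature list can be illustrated in a few lines. This is a generic sketch of the technique, not the UCB1100's actual circuit or register interface. The function name, the millivolt units and the 3300 mV reference are assumptions chosen for the example.

```python
def sar_convert(vin_mv, vref_mv, bits=10):
    """Successive approximation: settle the result one bit at a time, MSB first."""
    code = 0
    for bit in range(bits - 1, -1, -1):
        trial = code | (1 << bit)
        # Comparator step: keep the bit if the trial level does not exceed the input.
        # Integer form of: vin >= trial * vref / 2**bits
        if vin_mv * (1 << bits) >= trial * vref_mv:
            code = trial
    return code


# Hypothetical readings against an assumed 3300 mV reference.
print(sar_convert(0, 3300))
print(sar_convert(1650, 3300))
print(sar_convert(3300, 3300))
```

Each pass halves the remaining search interval, so a 10-bit result needs exactly ten comparisons regardless of the input level.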
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Eden OS V.2.0 Overview 

EROS (Eden Real-time Operating System) is a full-featured operating system 
designed from scratch to be: 

Compact: the operating system consumes resources in the form of ROM and 
RAM. In the product, these resources add to the BOM costs of the 
product and any space occupied by the OS must be justified. EROS 
is designed to be small. The modularity is also a feature which 
supports the compactness of the operating system; where individual 
products do not need a feature it can be omitted or replaced by 
some subset, leaving more room for the visible components that 
add features and thus perceived value. 

Open: an open OS will be more likely to attract 3rd party developers 
looking to design software products for sale, so allowing more 
value in the form of available features to be added to products 
based on the OS. EROS has a published API and a PC-based SDK 
which supports the development of applications in a readily 
available development platform. 

Modular: Each component individually and in many cases sub-components 
may be omitted or replaced without difficulty where their 
functionality is not needed or has to be changed for particular 
products. 

Portable: 99% written in ANSI C, porting to a new processor and/or tailoring 
to a specific product design is sufficiently simple and predictable for 
this to be completely acceptable within a product development 
lifecycle. EROS offers the same application interface on each 
platform, allowing applications to run on any EROS platform. 
EROS application development is carried out on a PC SDK 
incorporating a subset of the target OS. In the medium term, Eden 
will adopt the GNU toolset for the development of EROS itself and 
support this toolset for all targets. 

The overall structure of EROS is shown in the enclosed slide. 
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The components of EROS are: 

Advanced Real-time Kernel (ARK): This is the core of EROS; based on the 
ITRON 3 specification and extended, this supports pre-emptive, prioritised 
multitasking, message queues, semaphores, rendezvous ports, event flags and 
interrupt handling. 

Virtual Memory Management (VMM): Depending on the level of support 
available within the chosen platform, this offers protection against faulty 
applications, mapping of virtual memory onto real memory and supplies the 
dynamic memory handling (malloc() and free()). 

Eden's Visual Environment (EVE): This offers an object oriented means of 
building up a GUI. EVE implements a core set of simple objects which do not 
impose a 'look and feel' on the OEM and application provider. EVE also supports 
a limited number of compound objects (as the name implies, constructed by joining 
simple objects together). Application writers can easily generate their own 
compound objects to implement the GUI they design. 

Advanced Database Access Module (ADAM): This is a traditional database 
implementation, offering a record structure, insert, delete, search, data integrity 
checks and record locking. It differs from other database implementations by being 
designed to operate in an embedded environment. 

Clipboard Application Interface (CAIN): The EROS clipboard supports copy, 
cut and paste and drag and drop. It does this by allowing applications to set up 
self-describing data items which can then be passed between applications which 
have no knowledge of each other. 

Generic Object Data System (GODS): EROS' file system is built as a number of 
layers, allowing multiple filing systems to be supported (typically a 
DOS-compatible filing system on PC-cards and a Flash-oriented one for built-in 
non-volatile storage) without the applications being aware of such details. 
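
The layering described for the file system can be pictured as an upper layer that routes each request to whichever filing system is mounted for that path. The sketch below is illustrative only: the class names, the mount prefixes and the in-memory storage are assumptions and do not represent the EROS API.

```python
class FilingSystem:
    """Hypothetical lower layer: one concrete filing system (kept in memory here)."""

    def __init__(self, name):
        self.name = name
        self._files = {}

    def write(self, path, data):
        self._files[path] = bytes(data)

    def read(self, path):
        return self._files[path]


class FileLayer:
    """Hypothetical upper layer: applications see one interface for every medium."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix, filing_system):
        self._mounts[prefix] = filing_system

    def _resolve(self, path):
        # Longest matching prefix wins, so nested mount points resolve correctly.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self._mounts[prefix], path[len(prefix):]
        raise FileNotFoundError(path)

    def write(self, path, data):
        filing_system, rest = self._resolve(path)
        filing_system.write(rest, data)

    def read(self, path):
        filing_system, rest = self._resolve(path)
        return filing_system.read(rest)


files = FileLayer()
files.mount("/card/", FilingSystem("dos-compatible"))
files.mount("/flash/", FilingSystem("flash-oriented"))
files.write("/card/memo.txt", b"hello")
files.write("/flash/settings", b"\x01")
```

The application code calls only `files.read` and `files.write`; which filing system holds the data is decided by the mount table.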

PC card services: EROS supports SRAM, Flash and ATA drives as storage and 
data exchange devices. The PC card services offer a key set of facilities allowing 
support for specific card types to be developed as necessary. 

Device Handling: One of the features of embedded systems is that they often have 
non-standard devices and PC-cards supply loadable devices which may not be 
known at the time the system is first built. EROS' Device Manager supports the 
dynamic addition of device drivers and allows handler tasks to establish a 
connection to whichever is the most appropriate driver. 
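
One way to picture the dynamic addition of drivers is a registry that handler tasks query at connection time. The sketch below is a guess at the shape of such a mechanism, not Eden's implementation. In particular, the numeric rank used to decide which driver is "most appropriate" is an assumption, as are all names.

```python
class DeviceManager:
    """Hypothetical registry; drivers can be added at any time after start-up."""

    def __init__(self):
        self._drivers = []

    def add_driver(self, name, device_class, rank):
        # 'rank' is an assumed suitability score: higher means more appropriate.
        self._drivers.append((rank, name, device_class))

    def connect(self, device_class):
        """Return the name of the best registered driver for a device class."""
        matches = [d for d in self._drivers if d[2] == device_class]
        if not matches:
            raise LookupError(f"no driver for {device_class!r}")
        return max(matches, key=lambda d: d[0])[1]


manager = DeviceManager()
manager.add_driver("builtin-uart", "serial", rank=1)
first = manager.connect("serial")

# A PC-card inserted later supplies a loadable driver unknown at build time.
manager.add_driver("pc-card-modem", "serial", rank=5)
second = manager.connect("serial")
```

Because the lookup happens at connection time, a handler that connects after the card is inserted picks up the newly loaded driver without being rebuilt.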

TCP/IP: EROS supports TCP/IP, SLIP and PPP. A number of higher-level 
protocols are supported as standard within the OS including UDP, FTP, SMTP, 
POP3, and HTTP. Other protocols are supported on a specific product or 
implementation basis. 

Other features supported by EROS include: 

Linking and Loading: Embedded systems are typically provided as a single ROM 
containing the operating system and all the applications. The addition of new 
applications and the correction of those supplied in ROM is difficult. Flash 
memory is used, but the mechanisms for upgrade and addition are usually clumsy. 
EROS makes use of a Dynamic Linker Loader (ELF) to overcome much of this 
difficulty. EROS itself and built-in applications are installed in ROM but their 
external linkage symbols are loaded into RAM during start-up. Patches can be 
installed so that later in the start-up sequence some of these symbols are changed 
to point to new code, thus avoiding the obsolete areas of code in ROM. Similarly, 
applications which are loaded dynamically are linked to this symbol table and so 
use the correct built-in and patched code. 
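
The patching scheme described above amounts to a symbol table that is copied out of ROM at start-up and then selectively repointed. The following sketch models that idea with a dictionary of callables. The function and symbol names are hypothetical, and the real linker loader works on object code rather than Python functions.

```python
def rom_greet():
    return "greeting v1 (ROM)"


def rom_version():
    return "1.0"


# Linkage symbols as shipped in ROM (immutable in a real product).
ROM_SYMBOLS = {"greet": rom_greet, "version": rom_version}


def boot(patches):
    """Copy the ROM symbols into a RAM table, then repoint any patched symbols."""
    symbols = dict(ROM_SYMBOLS)
    symbols.update(patches)
    return symbols


def load_application(symbols):
    """Link a dynamically loaded application against the live symbol table."""
    greet = symbols["greet"]
    version = symbols["version"]
    return lambda: f"{greet()} / {version()}"


def patched_greet():
    return "greeting v2 (patch)"


app = load_application(boot({"greet": patched_greet}))
print(app())
```

The application never names the ROM routine directly, so it picks up the patched code for `greet` and the unpatched built-in code for `version` from the same table.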

Localisation: The OS structure supports OEMs and application developers in 
providing a framework within which applications can be constructed which are 
easily ported from language to language and from country to country, with little or 
ideally no change to software. 

Power Management: Embedded applications are often battery powered and 
hence power use is critical. While the degree of support offered by particular 
processors and products will vary, EROS supports an API which allows 
applications to be constructed in a power-sensitive manner and supports the specific 
attributes of particular platforms in an appropriate manner. 

Application Interface: Any application program interacts with EROS through the 
Application Program Interface (API). At the programming level these appear as 
function calls. These functions are primarily in the form of 'helpers' which execute 
as part of the application task and exchange information with one or more EROS 
tasks before returning to the application code. Responses and other input from 
EROS are provided by messages sent to the task's input queue or, for so-called 
'blocking' calls, by the helper function using a 'rendezvous' for the exchange. 
Application tasks are usually structured as a single message handling loop which 
takes messages from a message queue. 
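
The single message handling loop mentioned above has a familiar shape, sketched here with Python's standard queue module standing in for a task's input queue. The message kinds, the "QUIT" convention and the handler table are assumptions made for the example and are not taken from the EROS API.

```python
import queue


def application_task(inbox, handlers):
    """Take messages from the input queue one at a time and dispatch them."""
    handled = []
    while True:
        kind, payload = inbox.get()   # blocks until a message arrives
        if kind == "QUIT":            # assumed shutdown convention
            break
        handler = handlers.get(kind)
        if handler is not None:       # unknown message kinds are ignored
            handled.append(handler(payload))
    return handled


inbox = queue.Queue()
inbox.put(("KEY", "a"))
inbox.put(("TIMER", 3))
inbox.put(("QUIT", None))

results = application_task(
    inbox,
    {"KEY": lambda key: f"key {key}", "TIMER": lambda ticks: f"tick {ticks}"},
)
print(results)
```

Keeping all input on one queue means the task handles events strictly in arrival order and needs no locking around its own state.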
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Development Tools: EROS includes a set of tools to enable applications to be 
developed for EROS platforms. Such applications will usually be platform 
(processor) and product independent, subject to appropriate devices being 
available to handle the interfaces. The toolset comprises: 

• a sub-set of EROS which executes in DOS on a PC and provides an 
environment in which most applications can be developed and tested. This 
requires that the developer uses the Borland 4.5 development system. 

• cross-compilers, linker and host-target debuggers are specific to the target 
platform; Eden will recommend these on a platform specific basis but in the 
medium term will primarily suggest and support the GNU tools. 

• a terminal/target monitor program which allows internal details of EROS to be 
examined 

• font and icon editors 

• full linking instructions are provided to allow OEMs to build ROM images 
which include EROS and built-in applications 

• full construction details are supplied to allow a patch file to be created 

• full instructions are supplied to allow loadable programs to be produced 

• EROS for the target platform is supplied in the form of shared libraries making 
up the 'helper' functions, object code for the EROS tasks and an initial startup 
sequence to be modified by or on behalf of the OEM. 
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Target hardware and product-specific issues: A very large proportion of EROS 
is hardware and product independent, requiring simply re-compilation to run on a 
new platform. Thus the amount of effort required to tailor EROS to a specific 
processor and product configuration is relatively small. 

The areas usually requiring rework on a per-platform (i.e. per-processor) basis are: 

• basic serial port driving and monitor production. 

• kernel mapping at the lowest level 

• core start-up sequence 

• memory mapping to use the target architecture 

The primary areas where such work is usually necessary on a per-product basis 
are: 

• keyboard, screen and digitiser handling: typically each product uses different 
hardware in these areas. EROS offers a simple interface to program to and 
Eden will do this work if required. 

• memory configuration and start-up: EROS supplies a skeleton start-up 
sequence (above) for each target platform; extending this is a product-specific 
task. 

• Non-standard devices: EROS has a device handling architecture which supports 
the addition of new device handlers. 

• PC-card interfacing: Eden generally has to rework the lower levels of PCMCIA 
card handling to use the particular controller selected. 

• The development version of EROS on the PC requires changes to match the 
screen size of the target product, to support GUI development. 

Claims: 

1. A portable apparatus for communication of audio signals 
in analog and digital form and for storage of the same, 
comprising: 

digital storage means; 

a communication connection to a communication channel; 

a telecommunications interface having a communications input 
and output coupled to said communication connection and a 
digital input and output; 

an analog-to-digital converter having an output coupled to 
said storage means; and 

a controller coupled to said storage means and said 
telecommunications interface digital input and output and 
comprising: 

means for detecting whether a signal on said communication 
connection is an analog or digital audio signal; 

routing means controlled by said means for detecting and 
coupled to said telecommunications interface, said storage 
means and said analog-to-digital converter, upon said 
detecting means detecting a digital signal said routing means 
causing the digital output of said telecommunications 
interface to be coupled to said storage means, upon said 
detecting means detecting an analog signal said routing means 
causing said telecommunications interface to bypass the signal 
on said connection and coupling the same to said 
analog-to-digital converter for subsequent storage in said 
storage means. 

3. The apparatus of Claim 1 further comprising the coupling 
to said storage means being effected through a device which 
compresses the signal prior to storage. 

4. The apparatus of Claim 1, said controller further 
comprising: 

means for assembling digital messages stored in said storage 
means into a packetized data stream containing data and 
control bits; and 

means for coupling said packetized data stream to the digital 
input of said telecommunications interface for transmission 
over said communication channel. 

5. An apparatus as in claim 3, wherein said controller causes 
said telecommunications interface to transmit said packetized 
data stream at a rate that is substantially higher than the 
transmission rate of digitized voice. 

6. The apparatus of claim 1 further comprising a connection 
to a digital communications channel and an interface 
therebetween and said controller. 

7. The apparatus of claim 1 wherein said digital 
communication channel and the corresponding interface are 
designed to handle infrared communications. 

8. The apparatus of claim 1 further comprising a 
bar code reader coupled to said controller. 

9. The apparatus of claim 1 further comprising an 
LCD touchscreen coupled to said controller. 

10. An apparatus for communication of audio signals 
in analog and digital form and for storage of the same, 
comprising: 

digital storage means; 

a connection to a communication channel; 

a telecommunications interface having an analog input 
and output coupled to said connection and a digital input and 
output; and 

a controller coupled to said storage means and said 
telecommunications interface and comprising: 

means for assembling digital messages stored in said 
storage means into a packetized data stream containing data and 
control bits; and 

means for coupling said packetized data stream to the 
digital input of said telecommunications interface for 
transmission over said communication channel. 

11. An apparatus as in claim 9 wherein said 
controller causes said telecommunications interface to transmit 
said packetized data stream at a rate that is substantially 
higher than the transmission rate of digitized voice. 

12. An apparatus as in claim 9 wherein said 
controller includes a module for detecting receipt on the 
communication channel of a message in HTML language and 
permitting two-way communication in said language. 

13. An apparatus as in claim 9 wherein said 
controller includes a module for detecting receipt on the 
communication channel of a message in FTP language and 
permitting two-way communication in said language. 

14. An apparatus as in claim 9 wherein said 
controller further comprises a speech synthesizer responsive 
to receipt of text information over said communication channel 
to produce an audible message simulating said text information 
being spoken by a human voice. 

15. An apparatus as in claim 9 wherein said 
controller further comprises a database management module for 
receiving information about stored data and permitting 
selective retrieval of said information. 

16. A method for communication of audio signals in 
analog and digital form over a communication channel and for 
storage of the same, comprising the steps of: 

detecting whether a signal on said channel is an 
analog or digital audio signal; 

upon detecting a digital signal on said channel, 
storing in a digital storage means the output of a 
telecommunications interface of the type having an input 
coupled to said channel and a digital output; 

upon detecting an analog signal on said channel, 
converting the same from analog to digital form and storing the 
converted signal in a digital storage means. 

17. The method of Claim 15 wherein prior to either 
of said storing steps said signal is compressed. 

18. The method of Claim 15 performed with a 
telecommunications interface of the type having an analog input 
and output coupled to said channel and a digital input and 
output and further comprising the steps of: 

assembling digital messages stored in said 
storage means into a packetized data stream containing data and 
control bits; and 

coupling said packetized data stream to the 
digital input of said modem for transmission over said 
communication channel at a rate that is substantially higher 
than the transmission rate of digitized voice. 



19. A method for communication of audio signals in 
analog and digital form over a communication channel and for 
storage of the same, said method being performed with a 
telecommunications interface of the type having an analog input 
and output coupled to said channel and a digital input and 
output and comprising the steps of: 

assembling digital messages stored in a storage means 
into a packetized data stream containing data and control bits; 
and 

coupling said packetized data stream to the digital 
input of said modem for transmission over said communication 
channel at a rate that is substantially higher than the 
transmission rate of digitized voice. 

20. A portable device which permits the user to 
record, edit, play and review voice messages and other audio 
material which may be received from, and subsequently 
transmitted to, a remote apparatus through a communication 
link, comprising: 

a receptacle for a power source; 

integrated circuitry for localized recording, 
editing, storage and playback of audio signals powered from 
said receptacle; 

non-volatile storage means, access to which is 
controlled by said integrated circuitry; 

a built-in speaker and microphone coupled with said 
integrated circuitry for audible playback and local input, 
respectively, of audio; 

a telecommunications interface chip set coupled with 
said integrated circuitry; 

a modular telephone jack coupled to said modem chip 
set; 

the integrated circuitry operating the device so as 
to transmit and receive audio signals at a rate substantially 
faster than originally recorded. 

21. A device in accordance with claim 19 wherein 
said integrated circuitry includes a module that is operative 
to permit distinguishing between analog and digital signals 
received on the communication link, the analog signals being 
presented to said integrated circuitry without being processed 
by said telecommunications interface chip. 

22. A device in accordance with claim 19 wherein 
said integrated circuitry includes a module permitting 
communication via said communication link over the internet 
utilizing at least one protocol available thereover. 

23. A device in accordance with claim 19 wherein 
said integrated circuitry includes a module that recognizes a 
signal received over the communication link as text and 
converts the signal to a signal emulating the sound of a human 
voice speaking the text. 
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(57) Abstract 

A portable device is disclosed which permits the user to record, edit, play and review voice messages and other audio material which 
may be received from, and subsequently transmitted to, a remote voice processing or interactive voice response (IVR) host computer over a 
communication link. A preferred device contains its own power source, integrated circuitry (11) and control buttons to permit the localized 
recording, editing, storage and playback of audio signals through a built-in speaker (18), microphone (26) and removable memory card. The 
device also contains a standard RJ-11 telephone jack (20), modem chip and control of audio signals to and from a host computer. The device 
contains circuitry which permits it to transmit and receive audio signals at a rate substantially faster than originally recorded. 
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